Open Information Extraction with Tree Kernels

نویسندگان

  • Ying Xu
  • Mi-Young Kim
  • Kevin Quinn
  • Randy Goebel
  • Denilson Barbosa
چکیده

Traditional relation extraction seeks to identify pre-specified semantic relations within natural language text, while open Information Extraction (Open IE) takes a more general approach, and looks for a variety of relations without restriction to a fixed relation set. With this generalization comes the question, what is a relation? For example, should the more general task be restricted to relations mediated by verbs, nouns, or both? To help answer this question, we propose two levels of subtasks for Open IE. One task is to determine if a sentence potentially contains a relation between two entities? The other task looks to confirm explicit relation words for two entities. We propose multiple SVM models with dependency tree kernels for both tasks. For explicit relation extraction, our system can extract both noun and verb relations. Our results on three datasets show that our system is superior when compared to state-of-the-art systems like REVERB and OLLIE for both tasks. For example, in some experiments our system achieves 33% improvement on nominal relation extraction over OLLIE. In addition we propose an unsupervised rule-based approach which can serve as a strong baseline for Open IE systems.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploring syntactic structured features over parse trees for relation extraction using kernel methods

Extracting semantic relationships between entities from text documents is challenging in information extraction and important for deep information processing and management. This paper proposes to use the convolution kernel over parse trees together with support vector machines to model syntactic structured information for relation extraction. Compared with linear kernels, tree kernels can effe...

متن کامل

Composite Kernels For Relation Extraction

The automatic extraction of relations between entities expressed in natural language text is an important problem for IR and text understanding. In this paper we show how different kernels for parse trees can be combined to improve the relation extraction quality. On a public benchmark dataset the combination of a kernel for phrase grammar parse trees and for dependency parse trees outperforms ...

متن کامل

Tree Kernel-Based Relation Extraction with Context-Sensitive Structured Parse Tree Information

This paper proposes a tree kernel with contextsensitive structured parse tree information for relation extraction. It resolves two critical problems in previous tree kernels for relation extraction in two ways. First, it automatically determines a dynamic context-sensitive tree span for relation extraction by extending the widely-used Shortest Path-enclosed Tree (SPT) to include necessary conte...

متن کامل

A Study on Dependency Tree Kernels for Automatic Extraction of Protein-Protein Interaction

Kernel methods are considered the most effective techniques for various relation extraction (RE) tasks as they provide higher accuracy than other approaches. In this paper, we introduce new dependency tree (DT) kernels for RE by improving on previously proposed dependency tree structures. These are further enhanced to design more effective approaches that we call mildly extended dependency tree...

متن کامل

Weisfeiler-Lehman Graph Kernels

In this article, we propose a family of efficient kernels for large graphs with discrete node labels. Key to our method is a rapid feature extraction scheme based on the Weisfeiler-Lehman test of isomorphism on graphs. It maps the original graph to a sequence of graphs, whose node attributes capture topological and label information. A family of kernels can be defined based on this Weisfeiler-L...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013